In coding theory, fountain codes (also known as rateless erasure codes) are a class of erasure codes with the property that a potentially limitless sequence of encoding symbols can be generated from a given set of source symbols such that the original source symbols can ideally be recovered from any subset of the encoding symbols of size equal to or only slightly larger than the number of source symbols. The term fountain or rateless refers to the fact that these codes do not exhibit a fixed code rate.
A fountain code is optimal if the original k source symbols can be recovered from any k successfully received encoding symbols (i.e., excluding those that were erased). Fountain codes are known that have efficient encoding and decoding and that allow the recovery of the original k source symbols from any k’ of the encoding symbols with high probability, where k’ is just slightly larger than k.
LT codes were the first practical realization of fountain codes. Raptor codes and online codes were subsequently introduced, and achieve linear time encoding and decoding complexity through a pre-coding stage of the input symbols. Triangular network coding achieve linear encoding and decoding using non-linear encoding, and decoding using the back-substitution method.
One example is that of a data carousel, where some large file is continuously broadcast to a set of receivers. Using a fixed-rate erasure code, a receiver missing a source symbol (due to a transmission error) faces the coupon collector's problem: it must successfully receive an encoding symbol which it does not already have. This problem becomes much more apparent when using a traditional short-length erasure code, as the file must be split into several blocks, each being separately encoded: the receiver must now collect the required number of missing encoding symbols for each block. Using a fountain code, it suffices for a receiver to retrieve any subset of encoding symbols of size slightly larger than the set of source symbols. (In practice, the broadcast is typically scheduled for a fixed period of time by an operator based on characteristics of the network and receivers and desired delivery reliability, and thus the fountain code is used at a code rate that is determined dynamically at the time when the file is scheduled to be broadcast.)
Another application is that of hybrid ARQ in reliable multicast scenarios: parity information that is requested by a receiver can potentially be useful for all receivers in the multicast group.
A more advanced Raptor code with greater flexibility and improved reception overhead, called RaptorQ, has been specified in IETF RFC 6330. The specified RaptorQ code can be used with up to 56,403 source symbols in a source block, and a total of up to 16,777,216 encoded symbols generated for a source block. This code is able to recover a source block from any set of encoded symbols equal to the number of source symbols in the source block with high probability, and in rare cases from slightly more than the number of source symbols in the source block. The RaptorQ code is an integral part of the ROUTE instantiation specified in ATSC A-331 (ATSC 3.0)
A different approach to distributed storage using fountain codes has been proposed in Liquid Cloud Storage. Liquid Cloud Storage is based on using a large erasure code such as the RaptorQ code specified in IETF RFC 6330 (which provides significantly better data protection than other systems), using a background repair process (which significantly reduces the repair bandwidth requirements compared to other systems), and using a stream data organization (which allows fast access to data even when not all encoded symbols are available).
|
|